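2024 iThome Ironman Contest

DAY 17

Python

Python and R Introductory Syntax Comparison series, part 17

09 [python] Tables: inserting columns with dataframe.insert and string processing with Series.str.split [16th Ironman Day 17]

16th Ironman Contest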

carplee

Team: 為你抓鯉魚

2024-09-30 21:23:47


Download: song_rank2.csv

Reading the file

import pandas as pd

with open('data/song_rank2.csv') as f:
    p2 = pd.read_csv(f)

p2
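If the CSV is not at hand, a small stand-in can be built to follow along. The sketch below is only an illustration: the Artist values come from the output shown in this article, while the Rank and Song columns and their values are hypothetical, since the real file's other columns are not listed here.

```python
import io
import pandas as pd

# Hypothetical stand-in for data/song_rank2.csv: only the Artist values
# are taken from the article; Rank and Song are made up for illustration.
csv_text = """Rank,Song,Artist
1,Song A,五月天 阿信
2,Song B,"魏嘉瑩, 魏如昀"
3,Song C,"陳芳語 , 茄子蛋"
4,Song D,吳汶芳
"""

# read_csv accepts a file path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)
print(sample.Artist.tolist())
```

pd.read_csv also accepts the path directly, as in pd.read_csv('data/song_rank2.csv'), so the with open(...) wrapper is optional.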

Adding the Artist 1 and Artist 2 columns

1. Create a Series with the same number of rows as Artist

p2.Artist

    0           五月天 阿信
    1         魏嘉瑩, 魏如昀
    2        陳芳語 , 茄子蛋
    3          蕭敬騰, 馬佳
    4             吳汶芳 
    5     琳誼 Ring, 許富凱
    6             張語噥 
    7          Ray 黃霆睿
    8            飛兒樂團 
    9          摩登兄弟劉宇寧
    10          五月天 阿信
    11          五月天 阿信
    12        魏嘉瑩, 魏如昀
    13       陳芳語 , 茄子蛋
    Name: Artist, dtype: object

[0]*14

    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

[0]*len(p2.Artist)

    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

pd.Series( )

n = len(p2.Artist)
Artist2 = pd.Series([0]*n)

Artist2

    0     0
    1     0
    2     0
    3     0
    4     0
    5     0
    6     0
    7     0
    8     0
    9     0
    10    0
    11    0
    12    0
    13    0
    dtype: int64
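The placeholder can also be built from a single scalar by reusing the DataFrame's index, which keeps the rows aligned even when the index is not 0..n-1. A minimal sketch, assuming a small hypothetical frame df in place of p2:

```python
import pandas as pd

# Hypothetical small frame standing in for p2
df = pd.DataFrame({'Artist': ['五月天 阿信', '魏嘉瑩, 魏如昀', '吳汶芳']})

# A scalar is repeated for every label in the given index,
# so the new Series lines up with df row by row
placeholder = pd.Series(0, index=df.index, name='Artist2')

print(len(placeholder) == len(df))
print(placeholder.tolist())
```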

2. Inserting a column: dataframe.insert(index, col_name, values)

p2.insert(5, 'Artist2', Artist2)

p2
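Two details of insert are easy to miss: it changes the DataFrame in place and returns None, and it refuses a column name that already exists. The sketch below shows both on a hypothetical two-column frame (the Rank and Song names are made up for illustration).

```python
import pandas as pd

# Hypothetical frame; column names are for illustration only
df = pd.DataFrame({'Rank': [1, 2], 'Song': ['Song A', 'Song B']})

# insert() modifies df in place and returns None
result = df.insert(1, 'Artist2', [0, 0])
print(result)
print(list(df.columns))

# Inserting a column name that already exists raises ValueError
try:
    df.insert(0, 'Artist2', [1, 1])
except ValueError as err:
    print('ValueError:', err)
    duplicate_rejected = True
else:
    duplicate_rejected = False
```

Because of the in-place behaviour, writing p2 = p2.insert(...) would replace p2 with None.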

3. Processing the names in Artist

p2.Artist

    0           五月天 阿信
    1         魏嘉瑩, 魏如昀
    2        陳芳語 , 茄子蛋
    3          蕭敬騰, 馬佳
    4             吳汶芳 
    5     琳誼 Ring, 許富凱
    6             張語噥 
    7          Ray 黃霆睿
    8            飛兒樂團 
    9          摩登兄弟劉宇寧
    10          五月天 阿信
    11          五月天 阿信
    12        魏嘉瑩, 魏如昀
    13       陳芳語 , 茄子蛋
    Name: Artist, dtype: object

Goal: split one Artist value into two columns.

Before:

Artist
魏嘉瑩, 魏如昀

After:

Artist1	Artist2
魏嘉瑩	魏如昀

Splitting a string: string.split( )

p2.Artist[1]

    '魏嘉瑩, 魏如昀'

p2.Artist[1].split(',')

    ['魏嘉瑩', ' 魏如昀']

p2.Artist[1].split(',')[0]

    '魏嘉瑩'

p2.Artist[1].split(',')[1]

    ' 魏如昀'

Removing whitespace: string.strip( )

p2.Artist[1].split(',')[1].strip()

    '魏如昀'
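The split and strip steps can be combined into one small helper. This is only a sketch; split_names is a hypothetical name, not something from pandas or from the original notebook.

```python
def split_names(text, sep=','):
    """Split text on sep and strip whitespace around each piece."""
    return [part.strip() for part in text.split(sep)]


print(split_names('魏嘉瑩, 魏如昀'))
print(split_names('陳芳語 , 茄子蛋'))
print(split_names('吳汶芳 '))
```

Stripping every piece handles both the space after the comma and the trailing space that some single-artist rows carry.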

The Series versions of .split( ) and .strip( )

Series.str.split( , expand=True)

p2.Artist.str.split(',', expand=True) # expand=True puts each split piece into its own column

Get column 1

p2.Artist.str.split(',', expand=True)[1]

    0     None
    1      魏如昀
    2      茄子蛋
    3       馬佳
    4     None
    5      許富凱
    6     None
    7     None
    8     None
    9     None
    10    None
    11    None
    12     魏如昀
    13     茄子蛋
    Name: 1, dtype: object

Store column 0 and column 1 as art1 and art2

art1 = p2.Artist.str.split(',', expand=True)[0]

art2 = p2.Artist.str.split(',', expand=True)[1]
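Splitting on ',' alone leaves the surrounding spaces in place (for example ' 魏如昀'), so the pieces usually need Series.str.strip as well. The sketch below splits once and strips every resulting column; the artist Series is a small hypothetical sample built from values shown in this article.

```python
import pandas as pd

# Small sample built from Artist values shown in the article
artist = pd.Series(['五月天 阿信', '魏嘉瑩, 魏如昀', '陳芳語 , 茄子蛋', '吳汶芳 '])

# Split once, then strip whitespace in each resulting column;
# rows without a second artist stay missing
parts = artist.str.split(',', expand=True)
parts = parts.apply(lambda col: col.str.strip())
parts.columns = ['art1', 'art2']

print(parts)
```

Splitting once and indexing the result also avoids calling str.split twice for art1 and art2.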

string.split( )

Traditional approach

for i in p2.Artist:
    print(i.split(','))
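The loop can go one step further and collect the pieces into two lists, which is roughly what expand=True does for you. A minimal sketch, assuming a plain list artists in place of p2.Artist:

```python
# Plain list standing in for p2.Artist
artists = ['五月天 阿信', '魏嘉瑩, 魏如昀', '陳芳語 , 茄子蛋']

first, second = [], []
for name in artists:
    pieces = [p.strip() for p in name.split(',')]
    first.append(pieces[0])
    # Rows with a single artist have no second piece
    second.append(pieces[1] if len(pieces) > 1 else None)

print(first)
print(second)
```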

4. Insert: dataframe.insert(index, col_name, values)

Insert art1 at column position 4

p2.insert(4, 'art1', art1)

Insert art2 at column position 5

p2.insert(5,'art2', art2)

p2

Removing a column: dataframe.drop(col_name, axis='columns')

p2 = p2.drop(columns='Artist2')

p2
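Unlike insert, drop returns a new DataFrame and leaves the original untouched, which is why the result is assigned back to p2. The sketch below illustrates this on a hypothetical frame, along with the two equivalent spellings and the errors='ignore' option.

```python
import pandas as pd

# Hypothetical frame; column names are for illustration only
df = pd.DataFrame({'Rank': [1, 2], 'Artist2': [0, 0], 'Song': ['Song A', 'Song B']})

# drop() returns a new DataFrame; df itself is unchanged
trimmed = df.drop(columns='Artist2')
same = df.drop('Artist2', axis='columns')  # equivalent spelling

print(list(trimmed.columns))
print(list(df.columns))

# errors='ignore' skips labels that are absent instead of raising KeyError
still_ok = trimmed.drop(columns='Artist2', errors='ignore')
print(list(still_ok.columns))
```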

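Coming up:

09 [python] Tables: inserting columns with dataframe.insert and string processing with Series.str.split

10 [python] pandas row and column selection tools dataframe.loc[ ] and .iloc[ ]

10 [R] Row and column selection in R dataframes

11 Getting a column's position

12 Boolean values and conditional selection in tables

13 Plotting bar charts of counts

14 [Python] for loops and line plots with matplotlib.pyplot

14 [R] for loops and line plots with ggplot

15 [Python] for loops and parsing HTML web pages with the bs4 package (BeautifulSoup)

15 [R] for loops and parsing HTML web pages with httr, xml2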
